Search Result

Select

Local feature point matching algorithm with anti-affine property

QIU Yunfei, LIU Xing

Journal of Computer Applications 2020, 40 (4): 1133-1137. DOI: 10.11772/j.issn.1001-9081.2019091588

Abstract （382）

PDF （635KB）（372）

Save

In order to solve the problems that the existing local feature matching algorithm has poor matching effect and high time cost on affine images,and RANdom SAmple Consensus(RANSAC)algorithm cannot obtain a good parameter model on affine image matching,Affine Accelerated KAZE(A-AKAZE)algorithm with anti-affine property was proposed and the vector field consistency was used to screen interior points. Firstly,the scale space was constructed by using the nonlinear function,then the feature points were detected by Hessian matrix,and the appropriate areas were selected as the feature sampling windows with the feature points as the centers. Secondly,the feature sampling windows were projected on longitude and latitude to simulate the influence of different angles on the image,and then the Affine Modified-Local Difference Binary(A-MLDB)descriptors with anti-affine property were extracted from the projection region. Finally,the interior points were extracted by the vector field consistency algorithm. Experimental results show that the correct matching rate of A-AKAZE algorithm is more than 20% higher than that of AKAZE algorithm,is about 15% higher than that of AKAZE+RANSAC algorithm,is about 10% higher than that of Affine Scale-Invariant Feature Transform(ASIFT)algorithm, and is 5% higher than that of ASIFT+RANSAC algorithm;at the same time,A-AKAZE algorithm has the matching speed much higher than AKAZE+RANSAC,ASIFT and ASIFT+RANSAC algorithms.

Reference | Related Articles | Metrics

Select

W-POS language model and its selecting and matching algorithms

QIU Yunfei, LIU Shixing, WEI Haichao, SHAO Liangshan

Journal of Computer Applications 2015, 35 (8): 2210-2214. DOI: 10.11772/j.issn.1001-9081.2015.08.2210

Abstract （406）

PDF （877KB）（304）

Save

n-grams language model aims to use text feature combined of some words to train classifier. But it contains many redundancy words, and a lot of sparse data will be generated when n-grams matches or quantifies the test data, which badly influences the classification precision and limites its application. Therefore, an improved language model named W-POS (Word-Parts of Speech) was proposed based on n-grams language model. After words segmentation, parts of speeches were used to replace the words that rarely appeared and were redundant, then the W-POS language model was composed of words and parts of speeches. The selection rules, selecting algorithm and matching algorithm of W-POS language model were also put forward. The experimental results in Fudan University Chinese Corpus and 20Newsgroups show that the W-POS language model can not only inherit the advantages of n-grams including reducing amount of features and carrying parts of semantics, but also overcome the shortages of producing large sparse data and containing redundancy words. The experiments also verify the effectiveness and feasibility of the selecting and matching algorithms.

Reference | Related Articles | Metrics

Select

Feature transfer weighting algorithm based on distribution and term frequency-inverse class frequency

QIU Yunfei, LIU Shixing, LIN Mingming, SHAO Liangshan

Journal of Computer Applications 2015, 35 (6): 1643-1648. DOI: 10.11772/j.issn.1001-9081.2015.06.1643

Abstract （460）

PDF （908KB）（342）

Save

Traditional machine learning faces a problem: when the training data and test data no longer obey the same distribution, the classifier trained by training data can't classify test data accurately. To solve this problem, according to the transfer learning principle, the features were weighted according to the improved distribution similarity of source domain and target domain's intersection features. The semantic similarity and Term Frequency-Inverse Class Frequency (TF-ICF) were used to weight non-intersection features in source domain. Lots of labeled source domain data and a little labeled target domain were used to obtain the required features for building text classifier quickly. The experimental results on test dataset 20Newsgroups and non-text dataset UCI show that feature transfer weighting algorithm based on distribution and TF-ICF can transfer and weight features rapidly while guaranteeing precision.

Reference | Related Articles | Metrics

Select

Consumption sentiment classification based on two-dimensional coordinate mapping method

LIN Mingming QIU Yunfei SHAO Liangshan

Journal of Computer Applications 2014, 34 (9): 2571-2576. DOI: 10.11772/j.issn.1001-9081.2014.09.2571

Abstract （188）

PDF （1043KB）（411）

Save

Aiming at the sentiment classification for Chinese consumption comments, a method called two-dimensional coordinate mapping for sentiment classification based on corpus was constructed. According to the Chinese language characteristics, firstly, a more pertinent searching method based on corpus was proposed. Secondly, the rules of extracting the Chinese subjective phrases were defined. Thirdly, the choosing optimal seed words algorithm of the specific field was constructed. Finally, the two-dimensional coordinate mapping algorithm was constructed, which mapped the comment in two-dimensional Cartesian coordinates through calculating the coordinate values of the comment and decided the semantic orientation of it. Experiments were conducted on 1200 comments of milk (half of them are positive or negative comments) in Amazon. In the experiments, word “henhao-lou” was chosen as the optimal seed word by using choosing optimal seed words algorithm, then the sentiment orientation of it was decided according to two-dimensional coordinate mapping algorithm. The average F-measure of the proposed algorithm reached more than 85%. The result shows that the proposed algorithm can classify the sentiment of Chinese consumption comments.

Reference | Related Articles | Metrics

Select

Microblog bursty topic detection based on topic tree

QIU Yunfei GUO Milun SHAO Liangshan

Journal of Computer Applications 2014, 34 (8): 2332-2335. DOI: 10.11772/j.issn.1001-9081.2014.08.2332

Abstract （258）

PDF （623KB）（385）

Save

A kind of topic tree detection method based on Latent Dirichlet Allocation (LDA) model was put forward, in order to solve the problems of nonstandard terms, randomness, uncertainty of reference and large number of network terms in microblog texts, which can not be solved in traditional detection method. Relevant microblogs were reorganized into a topic tree by increasing information entropy in Natural Language Processing (NLP), combining with the design idea that Dirichelet prior experience value α and experience value β vary with the topic number, then the contribution statistics of every word in the text was achieved using the specific dual probability statistical method of this model. Thus, the interference information would be disposed in advance and the influence of garbage data on topic detection was excluded. Using this contribution as the parameter value of the improved Vector Space Model (VSM), bursty topics were extracted through calculating the similarity between texts, in order to improve the detection precision of bursty topics. Experiments of the proposed detection method were made from two aspects: comparison of the value of F and the manual detection. The experimental data show that, this algorithm not only can detect the bursty topics, but also can improve the precision about 3% and 7% respectively compared with the HowNet model and the TF-IDF (Term Frequency-Inverse Document Frequency) algorithm, and it is more in accordance with human's logic judgments than the traditional ones.

Reference | Related Articles | Metrics

Select

Establishment and application of consumption sentiment ontology library based on three-dimensional coordinate

QIU Yunfei LIN Mingming SHAO Liangshan

Journal of Computer Applications 2013, 33 (09): 2540-2545. DOI: 10.11772/j.issn.1001-9081.2013.09.2540

Abstract （557）

PDF （925KB）（669）

Save

Since the positive comments may have the non-truly satisfied comments, a method which can truly reflect the sentiment of the consumers was constructed in order to decrease the non-truly satisfied comments from the positive comments. The research oriented to the consumption sentiment shows that the consumption sentiment vocabulary should be extracted at first. According to the consumption sentimental features, consumption sentiment got classified into seven classes and twenty-five subclasses, and the three-dimensional coordinate model was established. Afterwards, Protégé was used to build a consumption sentiment ontology library so that the consumption sentiment can be automatically classified by the three-dimensional coordinate vocabulary classification algorithm. Moreover, the consumption sentiment judging algorithm was applied to automatically judge consumer comments based on the completed library. Finally, compared with the positive comment ratio of Taobao, the F-measure can reach more than 95%. It can eliminate the non-truly satisfied comments from positive comments and reflect the consumer's real emotion.